Introduction

~6,000 barcoded TP53 reporters were probed in MCF7 TP53 WT and KO cells, with and without Nutlin-3a stimulation. I previously processed the raw sequencing data, quantified the pDNA data, and normalized the cDNA data. In this script, I dissect the reporter activities in detail to understand how TP53 drives transcription and to identify the most sensitive TP53 reporters.


Setup

Libraries


Functions


Load data


Figure 1: Characterize P53 activities per condition

Aim: I want to characterize the reporter activity distributions in the tested conditions. Does Nutlin boost P53 reporter activity and is P53 inactive in the KO cells?

## [1] 0.9685877
## [1] 0.9036356
## [1] 0.903232

Conclusion: 1F: Replicates correlate well. 1G: Negative controls are inactive compared to P53 reporters. P53 reporters are more active in WT cells than in KO cells, and more active still upon Nutlin-3a stimulation.
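
Replicate agreement of the kind summarized in 1F can be quantified with pairwise Pearson correlations on log2 activities. The sketch below uses simulated data only; the column names `rep1` to `rep3` and the noise levels are assumptions for illustration and do not reflect the actual data frame used in this analysis.

```r
# Minimal sketch with simulated data: pairwise replicate correlation.
# All names and values here are hypothetical.
set.seed(1)
n <- 500

# Hypothetical "true" log2 reporter activities shared by all replicates
true_activity <- rnorm(n, mean = 2, sd = 1.5)

# Three replicates = shared signal + independent measurement noise
reps <- data.frame(
  rep1 = true_activity + rnorm(n, sd = 0.3),
  rep2 = true_activity + rnorm(n, sd = 0.3),
  rep3 = true_activity + rnorm(n, sd = 0.3)
)

# Pearson correlation matrix; the upper triangle holds the three replicate pairs
rep_cor <- cor(reps, method = "pearson")
pairwise <- rep_cor[upper.tri(rep_cor)]
print(round(pairwise, 3))
```

With real data, the same call would be applied to a wide table holding one column per replicate and one row per reporter.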


Figure 2: Effect of affinity and binding sites + binding site positioning

Aim: How does the binding site affinity, copy number, and their respective positioning affect reporter activity?

## [1] 0.006910845

## [1] 0.02978148

## [1] 0.0005569714

Conclusion: BS006 is the most responsive to Nutlin-3a. Addition of binding sites is super-additive. Positioning of binding sites matters - putting them directly next to each other is inhibitory, and putting them close to the TSS leads to higher activity.


Figure 3: The effect of the spacer length

Aim: Show how the spacer length between adjacent binding sites affects reporter activity.

Conclusion: Spacer length influences activity periodically. Adjacent binding sites need to be rotated by 180 degrees with respect to each other to achieve optimal activation.
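
A periodic dependence on spacer length is what one would expect if the rotational phase of neighboring sites around the DNA helix matters. The sketch below converts a spacer length into a rotation angle, assuming roughly 10.5 bp per helical turn of B-DNA. The functions `spacer_to_angle` and `phase_score` are hypothetical helpers for illustration; they are not the transformation behind `spacing_degree_transf` in this analysis, and they ignore the length of the binding sites themselves.

```r
# Minimal sketch: rotational offset implied by a spacer of a given length.
# Assumption: B-DNA with ~10.5 bp per helical turn.
spacer_to_angle <- function(spacer_bp, bp_per_turn = 10.5) {
  # Position within one helical turn, expressed in degrees [0, 360)
  (spacer_bp %% bp_per_turn) / bp_per_turn * 360
}

# Hypothetical score that is 1 at a 180 degree offset and 0 at 0 degrees
phase_score <- function(angle_deg) {
  (1 - cos(angle_deg * pi / 180)) / 2
}

spacers <- 0:21
phase_table <- data.frame(
  spacer_bp = spacers,
  angle_deg = spacer_to_angle(spacers),
  score     = phase_score(spacer_to_angle(spacers))
)
print(head(phase_table))
```

Plotting `score` against `spacer_bp` gives a curve that repeats every helical turn, which is the shape one would compare against the measured activities.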


Figure 4: The effect of the minimal promoter and the spacer sequence

Aim: Show how the P53 reporters interact with the two minimal promoters and the three spacer sequences.

Conclusion: Promoter and spacer sequence influence activity linearly.


Figures 5 & 6: Linear model + Selection of best reporters

Aim: Can we now explain every observation using a linear model?

## [1] 0.08400584
## MODEL INFO:
## Observations: 263 (1 missing obs. deleted)
## Dependent Variable: log2(reporter_activity)
## Type: OLS linear regression 
## 
## MODEL FIT:
## F(9,253) = 145.09, p = 0.00
## R² = 0.84
## Adj. R² = 0.83 
## 
## Standard errors: OLS
## ---------------------------------------------------------------
##                                     Est.   S.E.   t val.      p
## -------------------------------- ------- ------ -------- ------
## (Intercept)                         3.07   0.07    41.69   0.00
## promotermCMV                        1.30   0.08    15.39   0.00
## background2                        -0.89   0.08   -10.59   0.00
## background3                         0.37   0.08     4.45   0.00
## spacing_degree_transf               0.50   0.03    14.65   0.00
## affinity_id3_med_only               0.35   0.07     5.12   0.00
## affinity_id5_low_only               1.06   0.07    15.49   0.00
## affinity_id7_very-low_only          0.48   0.07     7.03   0.00
## promotermCMV:background2            0.38   0.12     3.19   0.00
## promotermCMV:background3           -0.82   0.12    -6.95   0.00
## ---------------------------------------------------------------

## MODEL INFO:
## Observations: 259 (5 missing obs. deleted)
## Dependent Variable: log2(reporter_activity)
## Type: OLS linear regression 
## 
## MODEL FIT:
## F(9,249) = 158.00, p = 0.00
## R² = 0.85
## Adj. R² = 0.85 
## 
## Standard errors: OLS
## ---------------------------------------------------------------
##                                     Est.   S.E.   t val.      p
## -------------------------------- ------- ------ -------- ------
## (Intercept)                         2.09   0.09    24.50   0.00
## promotermCMV                        1.60   0.10    16.73   0.00
## background2                        -0.88   0.10    -9.15   0.00
## background3                         0.53   0.10     5.49   0.00
## spacing_degree_transf               0.19   0.04     4.84   0.00
## affinity_id3_med_only              -0.04   0.08    -0.52   0.60
## affinity_id5_low_only               1.41   0.08    17.97   0.00
## affinity_id7_very-low_only         -0.26   0.08    -3.32   0.00
## promotermCMV:background2            0.19   0.14     1.42   0.16
## promotermCMV:background3           -1.15   0.13    -8.53   0.00
## ---------------------------------------------------------------

Conclusion: Top reporters are better than commercial reporters. The linear model gives insights into which features are important to drive high expression.
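
The term names in the model summaries above are consistent with a formula containing a promoter-by-background interaction plus additive spacing and affinity terms. The sketch below fits a model of that shape to simulated data with base R `lm()`. The reference levels `minP` and `1_high_only`, the simulated effect sizes, and the design grid are assumptions for illustration; only the term names are taken from the output above.

```r
# Minimal sketch with simulated data: a linear model with the same term
# structure as the summaries above. Values are hypothetical.
set.seed(42)

design <- expand.grid(
  promoter = factor(c("minP", "mCMV"), levels = c("minP", "mCMV")),
  background = factor(1:3),
  affinity_id = factor(c("1_high_only", "3_med_only",
                         "5_low_only", "7_very-low_only")),
  spacing_degree_transf = seq(0, 2, by = 0.5)
)

# Simulated activity: promoter and spacing effects plus noise on the log2 scale
log2_activity <- 3 +
  1.3 * (design$promoter == "mCMV") +
  0.5 * design$spacing_degree_transf +
  rnorm(nrow(design), sd = 0.3)
design$reporter_activity <- 2^log2_activity

# promoter * background expands to both main effects plus their interaction
fit <- lm(
  log2(reporter_activity) ~ promoter * background +
    spacing_degree_transf + affinity_id,
  data = design
)

print(round(coef(fit), 2))
print(summary(fit)$r.squared)
```

Because the response is on the log2 scale, each coefficient reads as a log2 fold change relative to the reference levels, with the interaction terms capturing promoter-specific deviations for each background.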

Session Info

paste("Run time: ",format(Sys.time()-StartTime))
## [1] "Run time:  33.83827 secs"
getwd()
## [1] "/DATA/usr/m.trauernicht/projects/P53_reporter_scan/analyses"
date()
## [1] "Tue Jun 13 18:35:39 2023"
sessionInfo()
## R version 4.0.5 (2021-03-31)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.6 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] scales_1.2.0        ggrastr_1.0.1       jtools_2.1.4       
##  [4] glmnetUtils_1.1.8   glmnet_4.1-4        Matrix_1.5-1       
##  [7] randomForest_4.6-14 plotly_4.10.0       ROCR_1.0-11        
## [10] tidyr_1.2.0         stringr_1.4.0       readr_2.1.2        
## [13] GGally_2.1.2        gridExtra_2.3       cowplot_1.1.1      
## [16] plyr_1.8.7          viridis_0.6.2       viridisLite_0.4.0  
## [19] ggforce_0.3.3       ggbeeswarm_0.6.0    ggpubr_0.4.0       
## [22] pheatmap_1.0.12     tibble_3.1.6        maditr_0.8.3       
## [25] dplyr_1.0.8         ggplot2_3.4.0       RColorBrewer_1.1-3 
## 
## loaded via a namespace (and not attached):
##  [1] nlme_3.1-152      bit64_4.0.5       httr_1.4.2        tools_4.0.5      
##  [5] backports_1.4.1   bslib_0.3.1       utf8_1.2.2        R6_2.5.1         
##  [9] vipor_0.4.5       mgcv_1.8-34       DBI_1.1.2         lazyeval_0.2.2   
## [13] colorspace_2.0-3  withr_2.5.0       tidyselect_1.1.2  bit_4.0.4        
## [17] compiler_4.0.5    cli_3.4.1         Cairo_1.5-15      labeling_0.4.2   
## [21] sass_0.4.1        digest_0.6.29     rmarkdown_2.13    pkgconfig_2.0.3  
## [25] htmltools_0.5.2   highr_0.9         fastmap_1.1.0     htmlwidgets_1.5.4
## [29] rlang_1.0.6       rstudioapi_0.13   shape_1.4.6       jquerylib_0.1.4  
## [33] farver_2.1.0      generics_0.1.2    jsonlite_1.8.0    vroom_1.5.7      
## [37] car_3.0-12        magrittr_2.0.3    Rcpp_1.0.8.3      munsell_0.5.0    
## [41] fansi_1.0.3       abind_1.4-5       lifecycle_1.0.3   stringi_1.7.6    
## [45] yaml_2.3.5        carData_3.0-5     MASS_7.3-53.1     grid_4.0.5       
## [49] parallel_4.0.5    crayon_1.5.1      lattice_0.20-41   splines_4.0.5    
## [53] pander_0.6.5      hms_1.1.1         knitr_1.38        pillar_1.7.0     
## [57] ggsignif_0.6.3    codetools_0.2-18  glue_1.6.2        evaluate_0.15    
## [61] data.table_1.14.2 vctrs_0.5.1       tzdb_0.3.0        tweenr_1.0.2     
## [65] foreach_1.5.2     gtable_0.3.0      purrr_0.3.4       polyclip_1.10-0  
## [69] reshape_0.8.9     assertthat_0.2.1  xfun_0.30         broom_0.8.0      
## [73] rstatix_0.7.0     survival_3.2-10   iterators_1.0.14  beeswarm_0.4.0   
## [77] ellipsis_0.3.2